157 research outputs found

    GeoAnnotator: A Collaborative Semi-Automatic Platform for Constructing Geo-Annotated Text Corpora

    Ground-truth datasets are essential for the training and evaluation of any automated algorithm. As such, gold-standard annotated corpora underlie most advances in natural language processing (NLP). However, only a few relatively small (geo-)annotated datasets are available for geoparsing, i.e., the automatic recognition and geolocation of place references in unstructured text. The creation of geoparsing corpora that include both the recognition of place names in text and the matching of those names to toponyms in a geographic gazetteer (a process we call geo-annotation) is a laborious, time-consuming, and expensive task. The field lacks efficient geo-annotation tools to support corpus building and lacks design guidelines for the development of such tools. Here, we present the iterative design of GeoAnnotator, a web-based, semi-automatic, and collaborative visual analytics platform for geo-annotation. GeoAnnotator facilitates collaborative, multi-annotator creation of large corpora of geo-annotated text by producing computationally generated pre-annotations that human annotators can then improve. The resulting corpora can be used to improve and benchmark geoparsing algorithms as well as various other spatial language-related methods. Further, the iterative design process and the resulting design decisions can inform annotation platforms tailored for other application domains of NLP.

    GeoLinter: A Linting Framework for Choropleth Maps

    Visualization linting has proven effective in helping users follow established visualization guidelines. Despite this success, visualization linting for choropleth maps, one of the most popular visualizations on the internet, has yet to be investigated. In this paper, we present GeoLinter, a linting framework for choropleth maps that assists in creating accurate and robust maps. Based on a set of design guidelines and metrics drawn from best practices in the cartographic literature, GeoLinter detects potentially suboptimal design decisions and recommends design improvements, with explanations, at each step of the design process. We perform a validation study to evaluate the proposed framework's functionality with respect to identifying and fixing errors and apply its results to improve the robustness of GeoLinter. Finally, we demonstrate the effectiveness of GeoLinter, validated through empirical studies, by applying it to a series of case studies using real-world datasets. Comment: to appear in IEEE Transactions on Visualization and Computer Graphics.

    Visually-Enabled Active Deep Learning for (Geo) Text and Image Classification: A Review

    This paper investigates recent research on active learning for (geo) text and image classification, with an emphasis on methods that combine visual analytics and/or deep learning. Deep learning has attracted substantial attention across many domains of science and practice because it can find intricate patterns in big data, but successful application of the methods requires a large set of labeled data. Active learning, which has the potential to address the data labeling challenge, has already had success in geospatial applications such as trajectory classification from movement data and (geo) text and image classification. This review is intended to be particularly relevant for extension of these methods to GIScience, to support work in domains such as geographic information retrieval from text and image repositories, interpretation of spatial language, and related geo-semantics challenges. Specifically, to provide a structure for leveraging recent advances, we group the relevant work into five categories: active learning, visual analytics, active learning with visual analytics, active deep learning, and GIScience and Remote Sensing (RS) using active learning and active deep learning. Each category is exemplified by recent influential work. Based on this framing and our systematic review of key research, we then discuss some of the main challenges of integrating active learning with visual analytics and deep learning, and point out research opportunities from technical and application perspectives. For application-based opportunities, we emphasize those that address big data with geospatial components.

    A Geovisual Analytic Approach to Understanding Geo-Social Relationships in the International Trade Network

    The world has become a complex set of geo-social systems interconnected by networks, including transportation networks, telecommunications, and the internet. Understanding the interactions between spatial and social relationships within such geo-social systems is a challenge. This research aims to address this challenge through the framework of geovisual analytics. We present the GeoSocialApp, which implements traditional network analysis methods in the context of explicitly spatial and social representations. We then apply it to an exploration of international trade networks in terms of the complex interactions between spatial and social relationships. This exploration using the GeoSocialApp helps us develop a two-part hypothesis: international trade network clusters with structural equivalence are strongly ‘balkanized’ (fragmented) according to the geography of trading partners, and the geographical distance weighted by population within each network cluster has a positive relationship with the development level of countries. In addition to demonstrating the potential of visual analytics to provide insight concerning complex geo-social relationships at a global scale, the research also addresses the challenge of validating insights derived through interactive geovisual analytics. We develop two indicators to quantify the observed patterns, and then use a Monte Carlo approach to support the hypothesis developed above.

    HEALTH GeoJunction: place-time-concept browsing of health publications

    Background: The volume of health science publications is escalating rapidly. Thus, keeping up with developments is becoming harder, as is the task of finding important cross-domain connections. When geographic location is a relevant component of research reported in publications, these tasks are more difficult because standard search and indexing facilities have limited or no ability to identify geographic foci in documents. This paper introduces HEALTH GeoJunction, a web application that supports researchers in the task of quickly finding scientific publications that are relevant geographically and temporally as well as thematically. Results: HEALTH GeoJunction is a geovisual analytics-enabled web application providing: (a) web services using computational reasoning methods to extract place-time-concept information from bibliographic data for documents and (b) visually-enabled place-time-concept query, filtering, and contextualizing tools that apply to both the documents and their extracted content. This paper focuses specifically on strategies for visually-enabled, iterative, facet-like, place-time-concept filtering that allows analysts to quickly drill down to scientific findings of interest in PubMed abstracts and to explore relations among abstracts and extracted concepts in place and time. The approach enables analysts to: find publications without knowing all relevant query parameters, recognize unanticipated geographic relations within and among documents in multiple health domains, identify the thematic emphasis of research targeting particular places, notice changes in concepts over time, and notice changes in places where concepts are emphasized. Conclusions: PubMed is a database of over 19 million biomedical abstracts and citations maintained by the National Center for Biotechnology Information; achieving quick filtering is an important contribution due to the database size. Including geography in filters is important due to rapidly escalating attention to geographic factors in public health. The implementation of mechanisms for iterative place-time-concept filtering makes it possible to narrow searches efficiently and quickly from thousands of documents to a small subset that meets place-time-concept constraints. Support for a more-like-this query creates the potential to identify unexpected connections across diverse areas of research. Multi-view visualization methods support understanding of the place, time, and concept components of document collections and enable comparison of filtered query results to the full set of publications.

    A Collaborative Process for Developing Map Symbol Standards

    Geographic information is commonly disseminated and consumed via visual representations of features and their environmental context on maps. Map design inherently involves generalizing reality, and one method by which mapmakers do so is through the use of symbols to represent features. Here we focus on the challenges associated with supporting mapmakers who need to work together to reach consensus on standardizing their map symbols. Based on a needs assessment study with mapmakers at the U.S. Department of Homeland Security, we designed a new, mixed-method symbol standardization process that takes place through a web-based, asynchronous platform. A study to test this new standardization process with mapmakers at DHS revealed that our process allowed participants to identify many issues related to symbol design, meaning, and categorization. The approach elicited sustained, iterative engagement and critical thinking from participants, and results from a post-study survey indicate that participants found it to be useful and usable. Results from our study and user feedback allow us to suggest multiple ways in which our approach and platform can be improved for future applications.

    Distributed usability evaluation of the Pennsylvania Cancer Atlas

    Background: The Pennsylvania Cancer Atlas (PA-CA) is an interactive online atlas to help policy-makers, program managers, and epidemiologists with tasks related to cancer prevention and control. The PA-CA includes maps, graphs, and tables that are dynamically linked to support data exploration and decision-making with spatio-temporal cancer data. Our Atlas development process follows a user-centered design approach. To assess the usability of the initial versions of the PA-CA, we developed and applied a novel strategy for soliciting user feedback through multiple distributed focus groups and surveys. Our process of acquiring user feedback leverages an online web application (e-Delphi). In this paper we describe the PA-CA, detail how we have adapted the e-Delphi web application to support usability and utility evaluation of the PA-CA, and present the results of our evaluation. Results: We report results from four sets of users. Each group provided structured individual and group assessments of the PA-CA as well as input on the kinds of users and applications for which it is best suited. Overall reactions to the PA-CA are quite positive. Participants did, however, provide a range of useful suggestions. Key suggestions focused on improving interaction functions, enhancing methods of temporal analysis, addressing data issues, and providing additional data displays and help functions. These suggestions were incorporated in each design and implementation iteration for the PA-CA and used to inform a set of web-atlas design principles. Conclusion: For the Atlas, we find that a design that utilizes linked map, graph, and table views is understandable to and perceived to be useful by the target audience of cancer prevention and control professionals. However, it is clear that considerable variation in experience using maps and graphics exists, and for those with less experience, integrated tutorials and help features are needed. In relation to our usability assessment strategy, we find that our distributed, web-based method for soliciting user input is generally effective. Advantages include the ability to gather information from users distributed in time and space and the relative anonymity of the participants, while disadvantages include less control over when and how often participants provide input and challenges for obtaining rich input.

    Geovisual analytics to enhance spatial scan statistic interpretation: an analysis of U.S. cervical cancer mortality

    Background: Kulldorff's spatial scan statistic and its software implementation, SaTScan, are widely used for detecting and evaluating geographic clusters. However, two issues make using the method and interpreting its results non-trivial: (1) the method lacks cartographic support for understanding the clusters in geographic context and (2) results from the method are sensitive to parameter choices related to cluster scaling (abbreviated as scaling parameters), but the system provides no direct support for making these choices. We employ both established and novel geovisual analytics methods to address these issues and to enhance the interpretation of SaTScan results. We demonstrate our geovisual analytics approach in a case study analysis of cervical cancer mortality in the U.S. Results: We address the first issue by providing an interactive visual interface to support the interpretation of SaTScan results. Our research to address the second issue prompted a broader discussion about the sensitivity of SaTScan results to parameter choices. Sensitivity has two components: (1) the method can identify clusters that, while being statistically significant, have heterogeneous contents comprised of both high-risk and low-risk locations and (2) the method can identify clusters that are unstable in location and size as the spatial scan scaling parameter is varied. To investigate cluster result stability, we conducted multiple SaTScan runs with systematically selected parameters. The results, when scanning a large spatial dataset (e.g., U.S. data aggregated by county), demonstrate that no single spatial scan scaling value is known to be optimal to identify clusters that exist at different scales; instead, multiple scans that vary the parameters are necessary. We introduce a novel method of measuring and visualizing reliability that facilitates identification of homogeneous clusters that are stable across analysis scales. Finally, we propose a logical approach to proceed through the analysis of SaTScan results. Conclusion: The geovisual analytics approach described in this manuscript facilitates the interpretation of spatial cluster detection methods by providing cartographic representation of SaTScan results and by providing visualization methods and tools that support selection of SaTScan parameters. Our methods distinguish between heterogeneous and homogeneous clusters and assess the stability of clusters across analytic scales. Method: We analyzed the cervical cancer mortality data for the United States aggregated by county between 2000 and 2004. We ran SaTScan on the dataset fifty times with different parameter choices. Our geovisual analytics approach couples SaTScan with our visual analytic platform, allowing users to interactively explore and compare SaTScan results produced by different parameter choices. The Standardized Mortality Ratio and reliability scores are visualized for all the counties to identify stable, homogeneous clusters. We evaluated our analysis result by comparing it to that produced by other independent techniques, including the Empirical Bayes Smoothing and Kafadar spatial smoother methods. The geovisual analytics approach introduced here is developed and implemented in our Java-based Visual Inquiry Toolkit.

    GeoCAM: A geovisual analytics workspace to contextualize and interpret statements about movement

    This article focuses on integrating computational and visual methods in a system that helps analysts identify, extract, map, and relate linguistic accounts of movement. We address two objectives: (1) build the conceptual, theoretical, and empirical framework needed to represent and interpret human-generated directions; and (2) design and implement a geovisual analytics workspace for direction document analysis. We have built a set of geo-enabled computational methods to identify documents containing movement statements and a visual analytics environment that uses natural language processing methods iteratively with geographic database support to extract, interpret, and map geographic movement references in context. Additionally, analysts can provide feedback to improve computational results. To demonstrate the value of this integrative approach, we have realized a proof-of-concept implementation focusing on identifying and processing documents that contain human-generated route directions. Using our visual analytic interface, an analyst can explore the results, provide feedback to improve those results, pose queries against a database of route directions, and interactively represent the route on a map.